The Pivotal Greenplum Full Load process

On the first run of a task, the Pivotal Greenplum target writes the data being replicated to CSV files in a folder that is defined for the task. The CSV files are named sequentially, for example, loadNNNN, where NNNN is a number that starts at 0 and increments with each new file. The maximum size of each CSV file is set by the user when configuring the Pivotal Greenplum database.

When a CSV file reaches its maximum size, it is renamed and moved into a load folder. It is then read by the gpfdist utility, which executes an SQL statement that loads the data into the target table. Once the file has been loaded, it is deleted.
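
The staging part of this process can be illustrated with a minimal Python sketch. It is not the product's implementation. The function name stage_rows, the folder arguments, the four-digit zero padding, and the ".csv" suffix added on rename are all assumptions made for illustration, as is the choice to move a final, partially filled file once the data runs out. The sketch covers only writing, renaming, and moving; the gpfdist load and the deletion of loaded files are not modeled.

```python
import csv
import os
import shutil


def stage_rows(rows, work_dir, load_dir, max_bytes):
    """Write rows to sequential loadNNNN files and move each full file to load_dir.

    Hypothetical helper: names, padding, and the rename suffix are assumptions.
    Returns the list of paths moved into load_dir, in order.
    """
    os.makedirs(work_dir, exist_ok=True)
    os.makedirs(load_dir, exist_ok=True)
    moved = []
    index = 0
    handle = None
    writer = None
    path = None

    def close_and_move():
        nonlocal handle, index
        handle.close()
        # Assumed rename scheme: append ".csv" when moving to the load folder.
        target = os.path.join(load_dir, os.path.basename(path) + ".csv")
        shutil.move(path, target)
        moved.append(target)
        handle = None
        index += 1

    for row in rows:
        if handle is None:
            # Start the next sequentially numbered file, beginning at 0.
            path = os.path.join(work_dir, f"load{index:04d}")
            handle = open(path, "w", newline="")
            writer = csv.writer(handle)
        writer.writerow(row)
        handle.flush()
        # Rotate once the file has reached the configured maximum size.
        if os.path.getsize(path) >= max_bytes:
            close_and_move()

    # Assumption: a partially filled last file is also handed over for loading.
    if handle is not None:
        close_and_move()
    return moved
```

In this sketch the size check runs after each row is written, so a file can exceed max_bytes by up to one row. A stricter limit would require checking the size before writing.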
